ABSTRACT

The evolution of the Internet in recent years has led to a
dramatic increase in the size of its graph at the Autonomous
System (AS) level. Soon, if not already, this size will make
the graph impractical for use by the research community, e.g.
for testing AS level routing protocols. Reproducing a smaller
snapshot of the AS graph is thus important. However, the first
step in this direction is to obtain the ability to faithfully
reproduce the full AS topology. The objective of our work is
to create a generator able to accurately emulate and reproduce
the distinctive properties of the Internet graph. Our approach
is based on (a) the identification of the jellyfish-like
structure [1] of the Internet and (b) the consideration of the
peer-to-peer and customer-provider relations between ASs. We
are the first to exploit the distinctive structure of the
Internet graph together with the information provided by the
AS relationships in order to create a tool with the
aforementioned capabilities. Compared with the existing
generators in the literature, the main difference is that our
tool does not try to satisfy specific metrics, but tries to
remain faithful to the conceptual model of the Internet
structure. In addition, our approach can lead to (i) the
identification of important attributes and patterns in the
Internet AS topology, as well as (ii) the extraction of
valuable information on the various relationships between ASs
and their effect on the formation of the Internet structure.
We implement our graph generator and evaluate it using the
largest and most recent available dataset for the AS topology.
Our evaluations clearly show the ability of our tool to
capture the structural properties of the Internet topology at
the AS level with high accuracy. Finally, we discuss the
potential of our generator not only to reproduce, but also to
shrink the input graph while maintaining its unique structure
and properties.
1. INTRODUCTION

The growing importance, usage and size of the Internet have
increased the interest in the Internet topology among the
research community; numerous research efforts have focused
on the study of the Internet graph. A thorough understanding
of the Internet topology at the AS level can contribute
significantly to the enhancement of fundamental applications
such as routing and information diffusion. In particular,
accurate topology information is necessary for several
reasons: (a) simulating real networks requires that the
network topology first be obtained, (b) network management
decisions are improved by topology knowledge (e.g. finding
network bottlenecks, deciding about the placement of new
routers, etc.), and (c) topology-aware protocols can
significantly improve the overall performance enjoyed by the
end users (e.g. QoS routing).

As part of the general effort to effectively describe the
Internet, various graph metrics have been introduced [2].
Applying graph mining techniques and evaluating the
corresponding metrics to capture and describe the various
properties of the Internet can provide us with valuable
information for reproducing the topology, as well as for
evaluating the similarity between the original and the
generated graph. An accurate generator would have numerous
applications including, but not limited to, the generation
of artificial graphs for simulations when real data is not
available, the examination of "what-if" scenarios,
extrapolations on the evolution of the graph structure, and
the creation of representative miniature graphs in order to
reduce the computational cost of simulations and graph-based
procedures. Furthermore, the generation process can provide
us with useful insights into the graph evolution over time
and the latent factors that affect and drive its evolution
[3], helping with decisions for network-related operations
as mentioned above.

Our objective in this work is to design and implement an
accurate graph generation process that can reproduce the
real Internet AS level topology. Our main contributions can
be summarized as follows:

• We identify and study the jellyfish structure of the 
AS level Internet topology. For the purposes of our 
work we use the largest and most detailed snapshot 
of the Internet available in the literature [4]. 

• We design a graph generation process based on the 
conceptual jellyfish model and the AS relations. 

• We implement and thoroughly evaluate our gener- 
ator using various graph metrics that can capture 
the structural properties of a graph. 

Scope of our work: Many graph generators have been proposed
in the literature thus far¹. Nevertheless, our work follows
a different approach from all the existing ones. In
particular, the novelty of our generative scheme lies mainly
in the exploitation of the conceptual model of the real
graph. Although we focus mainly on the Internet AS graph
(and its jellyfish-like conceptual model [1]), we would like
to stress that our general point of view can be applied to
any other graph generator. All that is required is a simple
conceptual model that represents the original graph to be
reproduced.

The rest of the paper is organized as follows. In the next
section we briefly give background on graph theory and refer
to related work on graph generators. In Section 3 we present
the jellyfish model, which forms the basis for our approach.
Section 4 presents our measurements on the real Internet
graph at the AS level. In Section 5 we describe our graph
generator, while in Section 6 we present experimental results
for the performance of our approach. Finally, we conclude
with a brief overview and ideas for future work in Section 7.

2. BACKGROUND AND RELATED STUDIES

In this section we provide a brief background on graph
theory concepts and related work in the literature.

2.1 Graph Metrics 

¹ A more detailed discussion of these works is given in
Section 2.

Every network can be represented through a graph
G = (V, E). The set V of vertices can represent various
network entities, depending on the abstraction level of the
graph. For example, each vertex might represent a terminal
station, a router or even an AS. For the purposes of our
study each vertex represents an AS. The set E of edges
represents the links between the nodes. The graph can be
directed or undirected. For the case of the AS graph, the
direction of an edge can potentially represent business
relations between the different ASs (customer-provider,
peer-to-peer). Metrics used to describe a graph include:

Node Degree: The degree of a node is the number of edges
incident to the corresponding vertex. For a node v, deg(v)
denotes the node degree. The maximum degree of a graph,
Δ(G), is:

$$\Delta(G) = \max_{k \in V} \deg(k) \tag{1}$$

while the minimum degree of a graph, δ(G), is given by:

$$\delta(G) = \min_{k \in V} \deg(k) \tag{2}$$

Finally, the average node degree of a graph, D(G), is:

$$D(G) = \frac{1}{|V|} \sum_{i=1}^{|V|} \deg(v_i) \tag{3}$$

For directed graphs, we can further define the indegree and
outdegree of a node, depending on whether the edge ends at
or originates from the node, respectively.

Node Degree Distribution P(k): The degree distribution is
the probability distribution of the node degree over the
whole network/graph. If we assume that there are n_k nodes
with degree k, then the degree distribution, P(k), is
simply:

$$P(k) = \frac{n_k}{|V|} \tag{4}$$

In a similar way, for directed graphs, we can define
the indegree and outdegree distributions.
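
As an illustrative sketch (not part of the tooling used in this work), the degree metrics of Equations (1)-(4) can be computed from an undirected edge list as shown below. The function name `degree_stats` and the edge-list input format are assumptions made for this example only.

```python
from collections import Counter


def degree_stats(edges):
    """Return (max degree, min degree, average degree, P(k)).

    `edges` is an iterable of (u, v) pairs of an undirected graph.
    Nodes without any edge cannot appear in an edge list, so they
    are not counted here (an assumption of this sketch).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    n = len(degree)                      # |V|
    values = list(degree.values())
    max_degree = max(values)             # Equation (1)
    min_degree = min(values)             # Equation (2)
    avg_degree = sum(values) / n         # Equation (3)

    # n_k: number of nodes with degree k; P(k) = n_k / |V|, Equation (4)
    n_k = Counter(values)
    p = {k: count / n for k, count in sorted(n_k.items())}
    return max_degree, min_degree, avg_degree, p
```

By construction, the values of P(k) sum to 1 over all observed degrees k.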

Diameter: Let d(u, v) denote the distance between nodes u
and v, i.e. the number of edges in the shortest path
connecting vertices u and v. Then the diameter D(G) of the
graph G is given by:

$$D(G) = \max_{u,v \in V} d(u,v) \tag{5}$$

From the above definition it is evident that for every
connected pair (u, v) of nodes we have d(u, v) ≤ D(G).
Relaxing this inequality so that it is satisfied by 90% of
all connected pairs of nodes of graph G, we get the
definition of the effective diameter. Formally, the
effective diameter, D_eff(G), is defined as the smallest
distance within which 90% of all connected pairs of nodes
can reach each other. It follows that D_eff(G) ≤ D(G).

The above metrics have been extensively used in the 
literature, in order to capture and describe various graph 
properties. 
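
The following minimal sketch shows one way to compute the diameter and the effective diameter with breadth-first search. It assumes an unweighted, undirected graph given as an adjacency dictionary and takes the effective diameter to be the smallest distance covering at least 90% of the connected pairs; the helper names are hypothetical and the all-pairs search is only practical for small graphs.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameters(adj, percent=90):
    """Return (diameter, effective diameter) over connected pairs."""
    lengths = []
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                # Every unordered pair is seen twice, which does not
                # change the maximum or the percentile.
                lengths.append(d)
    lengths.sort()
    diameter = lengths[-1]
    # Index of the smallest distance covering `percent`% of the pairs
    # (integer arithmetic avoids floating-point rounding).
    index = (percent * len(lengths) + 99) // 100 - 1
    return diameter, lengths[index]
```

For large graphs such as the AS topology, sampling a subset of source nodes is a common way to approximate both values.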



2.2 Static and Temporal Graph Generators 

To this end, a significant number of graph generators have
been proposed. Some of these attempts, including our own,
are focused on the Internet topology [5] [6], while others
view the generation process as a general graph problem [3].
Based on the generative approach followed, we can further
separate the generators into two groups:

• Generators that receive as input a static snapshot 
of a graph and attempt to create a duplicate with 
identical properties. The most common static pat- 
terns that these graph generators try to match are 
the degree distribution and the diameter of the 
graph. A lot of work has been done on the study of static
patterns for various types of graphs. As some examples, the
authors in [7] [8] [9] [10] [11] study the degree
distribution of a large variety of graphs that span a huge
spectrum, from the Internet and Web graphs to citation and
online social network graphs. Moreover, [12] [13] take a
close look at the diameter of the Internet and Web graphs as
well as that of social network graphs.

• Generators that study the evolution of a given 
graph and attempt to emulate it. A lot of work 
has been done on the temporal evolution of graphs 
and the underlying laws that dictate this evolu- 
tion. The work in [3] is a representative example of 
an evolution-based generator that considers tem- 
poral properties such as the densification power 
law and the shrinking diameter. 

Generators based on temporal evolution try to track and
emulate the gradual development of a graph, which
constitutes a rather unpredictable and unstable process.
Clearly, this can have a negative effect on the accuracy of
the generation process. Moreover, focusing specifically on
the Internet graph, the lack of information on the past
states and the evolution of the Internet topology forms
another significant obstacle to adopting an evolution-based
AS level generator. Our approach belongs to the first family
of generators, and is thus unaffected by these problems.

A process closely related to graph generation is that of
graph shrinking. The challenge here is to create a smaller
graph while maintaining the important properties and
dominant structure of the original. As a result, simulations
that were too expensive to run on the large original graph
can instead be applied to the miniature representation.
Clearly, generators belonging to the first category, like
ours, can potentially be used for graph shrinking. On the
other hand, evolution-based approaches will create smaller
snapshots with properties different from those of the real
current topology. Because they capture evolution properties,
these generators will reproduce the graph (smaller in size)
as it was in the past, when it had this smaller size.
Clearly this is not what we seek, since many graph
properties change with time, as noted above.

2.3 Graph Generating Models 

The simplest model for a graph generator is the Random
Graph Model, proposed by Erdős and Rényi [14]; introduced in
the early '60s, it constitutes the first graph generator
ever presented. With this approach, each pair of nodes has
the same, independent probability of becoming connected
through an edge. In other words, starting from a set of
nodes we add random edges between them, with each edge
having the same, independent probability. There is clearly a
tradeoff between simplicity and accuracy in the Random Graph
Model, since the above procedure is not able to generate
graphs that match the properties observed in most real-life
graphs.
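
To make the procedure concrete, here is a minimal sketch of the random graph model described above, in which every pair of nodes is connected independently with the same probability p. The function name `random_graph` and its parameters are illustrative assumptions, not an implementation taken from [14].

```python
import random
from itertools import combinations


def random_graph(n, p, seed=None):
    """Return the edge list of a random graph on nodes 0..n-1.

    Each of the n*(n-1)/2 possible undirected edges is included
    independently with probability p.
    """
    rng = random.Random(seed)
    return [(u, v)
            for u, v in combinations(range(n), 2)
            if rng.random() < p]
```

Because every edge is treated identically, the resulting degrees concentrate around (n - 1) * p, which is why this model does not produce the heavy-tailed degree distributions seen in real graphs.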

Many recent models for graph generators are based on
preferential attachment [15] [12] [8] [16] [17]. The
intuition behind this approach is that nodes with high
degrees will attract even more nodes; thus, the rich get
richer. Simply put, in these models the new nodes that are
added to the graph at each iteration of the algorithm prefer
to connect to nodes with a high degree. This approach can
create graphs with degree distributions similar to the ones
observed in real graphs². Finally, graphs generated with
preferential attachment tend to exhibit diameters that
increase slowly with the cardinality of set V.
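
A minimal sketch of preferential attachment is given below, assuming the simplest variant in which each new node adds exactly one edge and picks its target with probability proportional to the target's current degree. The function name and the single-edge simplification are assumptions of this example; the cited models differ in their details.

```python
import random


def preferential_attachment(n, seed=None):
    """Grow a graph on nodes 0..n-1 (n >= 2), one edge per new node."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Node i appears in `endpoints` exactly deg(i) times, so a uniform
    # draw from this list selects a node proportionally to its degree.
    endpoints = [0, 1]
    for new in range(2, n):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges
```

Keeping a list of edge endpoints is a common trick that avoids recomputing the degree-based weights at every step.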

Briefly, other approaches that have been proposed for graph
generation include the small-world generator [18] and the
Waxman generator [19]. Finally, Fabrikant et al. [5] have
studied the general problem of creating a graph under
resource constraints.

Most of the schemes proposed above try to match a limited
set of properties, or even just a single one, and thus fail
to capture the general conceptual model behind the graph
structure. To the best of our knowledge, we are the first to
exploit the conceptual model of the original topology (in
our case that of the Internet AS graph) in order to
implement an accurate graph generator.

3. THE JELLYFISH MODEL 

In this section we will present the conceptual model 
for the Internet topology that we considered. 

² This is clearly related to the fact that real-world graphs
follow power laws [1]: there are a few vertices with high
node degree and many vertices with low node degree.

Understanding the underlying conceptual structure of complex
networks is important for creating a generation tool.
Measuring and acquiring metric values (e.g. degree
distribution, diameter, etc.) can formally describe a
network graph. Nevertheless, as a further step, it is
important to obtain a model that can capture more visual
information about the network.

Siganos et al. [1] have proposed a conceptual model for the
Internet graph at the AS level. The authors created an
effective conceptual model of the Internet that can be
easily understood and depicted. Furthermore, they showed
that their model captures the most significant topological
properties of the inter-domain level of the Internet. In the
following paragraphs we provide a brief, yet complete,
presentation of the jellyfish model.

The jellyfish model defines a hierarchy of the graph 
nodes (ASs). First, the core of the graph is identified. 
This core is essentially a clique of high degree nodes, 
all connected to each other through peer-to-peer (P2P) 
links. Once the core is identified, the rest of the nodes 
can easily be classified. Thinking of the core as the head 
of the jellyfish, the rest of the nodes are distributed in 
shells that surround the head. The first shell contains 
all the nodes that are adjacent to the core, except the 
one degree nodes. Recursively, every shell contains all 
the nodes (except the one degree nodes) that are con- 
nected through an edge to some node that belongs to 
the previous shell. The one degree nodes are repre- 
sented as hangers, hanging from the shell of the other 
end of their single edge. A visualization of this model
can be found in [1].
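
The classification described above can be sketched as a breadth-first search that starts from the core. This is only an illustration under stated assumptions: the core is taken as given (finding the clique is not shown), the graph is an undirected adjacency dictionary, and the function name `jellyfish_layers` is hypothetical.

```python
from collections import deque


def jellyfish_layers(adj, core):
    """Assign nodes to shells and hangers, given the core.

    adj:  dict node -> iterable of neighbours (undirected graph)
    core: list of core nodes (shell 0)
    Returns (shell, hangers): shell maps each non-hanger node to its
    shell index; hangers maps each one-degree node to the shell of
    the node it hangs from.
    """
    shell = {v: 0 for v in core}
    hangers = {}
    queue = deque(core)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in shell or w in hangers:
                continue
            if len(adj[w]) == 1:
                # One-degree node: hangs from the shell of u.
                hangers[w] = shell[u]
            else:
                # First visit happens from the closest shell,
                # because the search proceeds level by level.
                shell[w] = shell[u] + 1
                queue.append(w)
    return shell, hangers
```

Nodes that are not reachable from the core are left unclassified by this sketch.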

By using this conceptual model, the authors in [1]
identified six layers in the Internet topology, counting
the core as layer 0. Their observations can be summarized
as follows:

• Approximately 80-90% of nodes are in the first 3 
layers. 

• The network grows "horizontally", which means that
the evolution of the graph does not result in more
layers but in larger, denser layers.

• The topological importance of a shell decreases as 
we move away from the core. 

• Most of the connectivity is towards the center.
This means that nodes in outer shells need to route
through previous shells for most of their shortest
path connections.

• The nodes in the first three layers are within 5 
hops from each other. 

This simple model, which is easy to visualize, captures
many important properties of the AS Internet topology. In
particular, the jellyfish model can accurately express the
following characteristics of the Internet AS graph:

• The network is compact, as 99% of the nodes are
within 6 hops; the jellyfish model exhibits the same
diameter [1].



• There exists a highly connected center in the Internet.
This center corresponds to the clique of the high degree
nodes, defined in the jellyfish model as the core.

• There exists a loose hierarchy in the Internet; nodes
far from the center are less important in the jellyfish
model.

• One-degree nodes are scattered everywhere; hang- 
ers hang from all jellyfish's shells. 

• The network has the tendency to be one large con- 
nected component (core). 

Among the other contributions of this model, is that 
it forms a benchmark that can be used to evaluate the 
performance of various graph generators. The degree to 
which a generated graph demonstrates the above struc- 
ture can be a testament of how well the graph gener- 
ation tool used can capture the actual Internet graph 
characteristics. 

In this work, we focus on the implementation of a generator
for the Internet AS topology that is able to produce graphs
faithful to the jellyfish model. For the purposes of our
study we use the most accurate snapshot of the AS topology
existing in the literature [4]. We start off by identifying
the jellyfish structure of this snapshot and its specific
properties in the next section. We then study the extracted
structure and use the obtained information to generate the
new graph (Section 5).

4. THE REAL INTERNET GRAPH 

In this section we present the data and the informa- 
tion obtained from the real Internet topology. 

In order to drive the implementation and test the accuracy
of our graph generator we need a detailed snapshot of the
real Internet topology. The latter helps us (i) to extract
the jellyfish-related parameters of our model (e.g. number
of shells, size of shells, node distributions, etc.) and
(ii) to compare the topology created by our generator with
the actual one. For the needs of our work, we used the
snapshot collected by Yihua He at UCR [4]. To date, this is
the largest available snapshot of the Internet. The graph
was obtained as part of a larger project in which the AS
links missing from the commonly used Internet topology
snapshots are identified [4]. This procedure includes
cross-validation of BGP routing tables, Internet Routing
Registries and traceroute data. An interesting finding is
that most of the missing peer-to-peer AS links are Internet
Exchange Point (IXP) links. More details on the project and
the data used can be found in [20] [21] [22] [23].

Using the above data, we extracted valuable information
with regard to the conceptual representation of the graph
as a jellyfish. We observe that approximately 87-90% of the
total number of nodes belong either to the core (we will
also refer to the core as ring 0) or to the first 2 rings
(shells), or are hangers originating from rings 0-2. Later,
during the presentation of our graph generator, it will
become clearer why this information is important.
Additionally, we would like to emphasize that only 9 nodes
belong to the core of the jellyfish.

Table 1: The distribution of the nodes and types of edges
among the various rings of the jellyfish model.

Table 2: The distribution of the hangers of the jellyfish
model.

Table 3: The distribution of the type of edges between
nodes at different rings.

Tables 1-3 contain important information for the
implementation of our graph generator. In particular, these
tables include values for the input parameters needed by
our model³. In a nutshell, we present data for:

• The percentages of the nodes that belong to a specific
ring.

• The percentages of the nodes that are hangers
originating from a specific ring.

• The percentages of P2P and CP (customer-provider)
edges that exist within each ring.

• The percentages of P2P and CP edges that belong
to each bridge. A bridge XY is defined as the set
of edges connecting nodes from ring X to nodes
from ring Y.

In the following section we elaborate on our generator. As
we show, our model requires a limited number of simple
parameters in order to capture and reproduce the conceptual
model of the Internet topology.



5. THE MODEL OF OUR GRAPH GENERATOR

In this section we will present the steps towards cre- 
ating our graph generator. 

For the purposes of our work, and in order to capture 
the principles of the jellyfish structure of the Internet 
topology, we used a simple deterministic approach. The 
hierarchical structure of our model has three distinct 
components: (i) rings, (ii) bridges and (iii) hangers. In 
the following, a brief explanation is given on how we 
dealt with each of these components in our generator. 

Rings: These are essentially the shells of the jelly- 
fish. We start by identifying the core (clique) , which we 
call ring 0. After that, we populate the remaining rings 
based on the jellyfish model. For each ring, we calcu- 
late the number of nodes (as a percentage of the total) 
and also the percentages of P2P and Customer-Provider 
edges that exist within the ring. 

³ More details on the computation of these parameters from
the Internet snapshot are given in the following section.



Bridges: These are the parts that connect the shells 
of the jellyfish. A bridge contains no nodes, but only 
the edges connecting the nodes of two specific rings. 
For each bridge, we calculate the percentages of P2P 
and Customer-Provider edges that belong to the bridge. 
It is entirely possible for a bridge between two specific 
rings to contain zero edges. 

Hangers: These are the hangers of the jellyfish model.
Each ring is paired with a set of hangers. For example,
hanger-set 1 belongs to ring 1 and contains the one-degree
nodes that stem from ring 1. For each hanger-set we
calculate the number of nodes (as a percentage of the
total) belonging to it.
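
The per-component quantities described above (ring sizes, hanger-set sizes, and edge types within rings and bridges) could be gathered along the following lines. This is a hedged sketch: the input format, the function name `jellyfish_parameters`, and the choice to report raw edge counts rather than percentages are all assumptions of this example, not a description of our implementation.

```python
from collections import Counter


def jellyfish_parameters(ring_of, hanger_of, typed_edges):
    """Summarize rings, hanger-sets and bridges of a classified graph.

    ring_of:     dict node -> ring index (0 is the core)
    hanger_of:   dict one-degree node -> ring it hangs from
    typed_edges: iterable of (u, v, kind), kind in {"P2P", "CP"}
    """
    total = len(ring_of) + len(hanger_of)
    ring_pct = {r: 100.0 * c / total
                for r, c in Counter(ring_of.values()).items()}
    hanger_pct = {r: 100.0 * c / total
                  for r, c in Counter(hanger_of.values()).items()}

    # Key (x, y, kind): x == y is an edge inside ring x, while
    # x < y is an edge of bridge XY. Hanger edges are skipped.
    edge_counts = Counter()
    for u, v, kind in typed_edges:
        if u in hanger_of or v in hanger_of:
            continue
        x, y = sorted((ring_of[u], ring_of[v]))
        edge_counts[(x, y, kind)] += 1
    return ring_pct, hanger_pct, edge_counts
```

Dividing each count by the total number of edges of the same kind would turn the counts into percentages, if that normalization is the one desired.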

After creating the jellyfish and populating its various 
parts, based on the real Internet topology, the next step 
is to study the rings in more detail and start building 
our own graph. After processing the given snapshot, it 
was clear that the P2P edges within each ring follow 
a power law, with a different coefficient for each ring. 
The coefficient of each ring is calculated and stored. 

The study of CP edges is more complicated, since 
there are constraints that need to be taken into ac- 
count. 

1. Loops need to be considered. A customer of a spe- 
cific node cannot be at the same time its provider 
(or recursively, a provider of its providers). 

2. In order to emphasize the hierarchical jellyfish
structure, some nodes have to be moved up in
the ring hierarchy so as to ensure that a node's
provider can only exist either in a higher ring or
in the same ring as the node.

3. Connectivity needs to be assured. Even though 
this turns out to be trivial after the application of 
our strategy and the above mentioned constraints, 
there could be cases where some extra edges need 
to be added to ensure that the connectivity of the 
structure is maintained. 
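
Constraint 1 amounts to keeping the customer-provider relation free of cycles. A minimal sketch of such a check is shown below; it assumes the CP edges added so far are stored as a dictionary from each node to the set of its providers, and the function name `creates_loop` is hypothetical.

```python
def creates_loop(providers, customer, provider):
    """Return True if adding the CP edge customer -> provider would
    make `customer` a direct or indirect provider of itself.

    providers: dict node -> set of its current providers
    """
    stack = [provider]
    seen = set()
    while stack:
        node = stack.pop()
        if node == customer:
            # `customer` is already above `provider` in the hierarchy.
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(providers.get(node, ()))
    return False
```

A candidate CP edge would be added only when this check returns False; otherwise a different provider or customer is tried.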

With the above constraints in mind, we implement a
rich-get-richer technique that accurately matches the
actual graph. The important thing to note here is that for
each ring of the original graph we need to calculate and
store the pace at which the rich get richer (coefficient).
Finally, a bias is introduced to increase the number of
nodes with exactly 2 providers, which is the common case in
the actual snapshot. With the exception of nodes with 2
providers, which constitute the vast majority of the nodes,
the rich-get-richer strategy manages to match the provider
distribution without further intervention.

A final constraint (constraint 4) is used to control the
effect of our generation technique. In particular, for each
ring, upper and lower bounds are placed on the number of CP
and P2P edges of a single node, based on the respective
bounds observed in the actual ring. In terms of
implementation, this means that nodes whose number of edges
is currently below the minimum are generally preferred,
while nodes that reach the upper bound are excluded from
further selection.
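
One possible reading of the bounded rich-get-richer selection is sketched below. The weighting (degree + 1) ** alpha, the strict preference for nodes below the lower bound, and the name `pick_node` are assumptions made for illustration; the text above does not fix these details.

```python
import random


def pick_node(degree, alpha, min_degree, max_degree, rng):
    """Pick an endpoint for a new edge inside one ring.

    degree: dict node -> current number of edges of the given type
    alpha:  per-ring coefficient controlling how fast the rich
            get richer (assumed weighting, see lead-in)
    Returns None when every node has reached the upper bound.
    """
    # Nodes at the upper bound are excluded (constraint 4).
    eligible = [v for v, d in degree.items() if d < max_degree]
    if not eligible:
        return None
    # Nodes still below the lower bound are served first.
    needy = [v for v in eligible if degree[v] < min_degree]
    pool = needy or eligible
    weights = [(degree[v] + 1) ** alpha for v in pool]
    return rng.choices(pool, weights=weights, k=1)[0]
```

Passing an explicit `random.Random` instance keeps a generation run reproducible for a given seed.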

Generation Process: With all the above parameters and
constraints in mind, we start building our graph by
creating the core and the corresponding clique using P2P
links. We then populate the rings by simply adding the
respective number of nodes to each ring. After the nodes
have been added, we add the P2P edges within the rings
using the calculated power law coefficients. The
rich-get-richer approach is then applied to the resulting
structure in order to populate the bridges with P2P edges.

The next step is the addition of the CP edges. A strategy
similar to the above is used, filling first the rings and
then the various bridges. The difference here is that there
are additional and more complicated constraints to be
considered, i.e. constraints 1-3⁴. One edge is added at a
time, as with the P2P edges, but only if it does not
violate any constraint. If a constraint is violated, we
choose another provider or customer, depending on the
violation, and try again.

The model is completed with the addition of the hangers.
The rich-get-richer technique is also used here, keeping
the upper and lower bounds in mind (constraint 4).

In the following section we evaluate the performance
of our generation process.

6. EVALUATION OF OUR MODEL 

In this section we will present some results from the 
evaluation of our graph generator. 

As mentioned above, some of the main metrics that are used
for describing properties of graphs are the node degree
distribution, the max/min/average node degree, the
(effective) diameter of the graph, as well as the
clustering coefficient [2].

Using existing tools [24], we calculate a set of graph
metrics for both the actual snapshot of the Internet [4]
and the graph generated by the process described in the
previous section. Table 4 presents the results obtained.

⁴ Note that for P2P edges, only the (simple) constraint 4
needs to be considered.

Both graphs, the Internet snapshot [4] and the graph
obtained from our generator, are directed. The direction of
each edge is based on the definition of the AS
relationships. P2P edges can be considered bidirectional,
while the direction of CP edges can be defined based on
whether we view the edge as a citation from a customer to
its provider or a service offer from a provider to a
customer. Nevertheless, we would like to stress that the
results presented here refer to the corresponding
undirected graph [24]. Consequently, since the metrics do
not consider direction, we only obtain structural
information about the graphs. The direction of an edge
represents the type of the corresponding link (P2P/CP).
Even though the undirected metrics cannot be used for
evaluating the quality of the reproduction of this
information, the very nature of our strategy guarantees an
accurate generation of the directed edges, at least in
terms of quantity and distribution.

Table 4: A comparison of the metrics computed for both
graphs.

Our evaluations, presented in Table 4, reveal that the
calculated metrics for the two graphs have very similar
values. As a result, the graph that our generator created
exhibits the same structural properties as the Internet
snapshot we provided to it as input. Furthermore, we can
show the following proposition.

Proposition 1. The effective diameter of our graph is the
same as the one reported by empirical studies of the
Internet AS topology [25].

Proof. As mentioned in [1], there is an upper bound on the
distance between any node u of layer k_u and any node w of
layer k_w. This upper bound is:

$$d(u, w) \le k_u + k_w + 1 \tag{6}$$

Furthermore, we have seen that almost 90% of the nodes
belong to layers 0 through 2. For any two such nodes,
k_u ≤ 2 and k_w ≤ 2, so Equation (6) gives
d(u, w) ≤ 2 + 2 + 1 = 5. Thus, 90% of the nodes are within
5 hops of each other. Recall that this is the definition of
the effective diameter. The graphs generated by our
generator are therefore guaranteed to exhibit an effective
diameter of at most 5, which is also the effective diameter
observed for the Internet AS level topology [25].⁵ □

⁵ This fact has also been verified by our evaluations.

Finally, Figures 1 and 2 present our results for the node
degree distribution and the node degree CCDF (Complementary
Cumulative Distribution Function) for both our graph and
the original AS topology used. The node degree distribution
is the plot of the number of nodes with a specific degree
versus their degree (i.e. the histogram of the node
degree), while the node degree CCDF represents the
probability that the degree of a randomly picked node is
greater than a certain value. What we can observe here is
that both metrics again provide a close match, in the sense
that the general trend in both graphs is the same. This
further supports the claim that our graph generator
accurately captures the structural properties of the
Internet graph at the AS level.
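
For reference, the node degree CCDF as defined above can be computed from a list of node degrees with a short routine such as the following. This is an illustrative sketch with a hypothetical function name, not the tool of [24] that was used to produce the figures.

```python
from collections import Counter


def degree_ccdf(degrees):
    """Map each observed degree k to P(degree > k).

    degrees: list with one degree value per node
    """
    n = len(degrees)
    counts = Counter(degrees)
    ccdf = {}
    remaining = n  # nodes whose degree is still above the current k
    for k in sorted(counts):
        remaining -= counts[k]
        ccdf[k] = remaining / n
    return ccdf
```

Plotting the returned values against k on log-log axes gives the usual view in which a power law appears as a straight line.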

7. CONCLUSIONS - FUTURE WORK 

The main goal of our work is to provide a tool, able 
to reproduce with high accuracy the Internet topology 
at the AS level. Unlike previous efforts, we focus on a 
conceptual model for the Internet topology, which can 
effectively capture the various properties of the real AS 
level graph. To reiterate, the main contributions of our 
work are: 

• We identified and studied the jellyfish structure 
using the largest available snapshot of the Internet. 

• We created (designed and implemented) a graph 
generator based on the conceptual model of the 
jellyfish for the Internet and the AS relations. 

• We evaluated our work, by using several metrics 
that can effectively capture the structure of a graph. 

The outcome of our work is very promising. The 
graph that our generator produces exhibits all the im- 
portant features of the conceptual model. Most graph 
generators up to now were focused on a single, specific 
metric, which was thought to capture most of the struc- 
tural information of the input graph. In this work, we 
proposed a different, novel approach, in which the ob- 
jective is to emulate a conceptual model as opposed to 
a specific metric. The novelty of our work can be at- 
tributed to the following two points: 

1. Our generator identifies the jellyfish-like structure
of the Internet [1] and uses it to construct the
backbone of the new graph.

2. The creation of the new graph is driven by the
noted relations between the ASs [26] (P2P and CP
relations).

Some interesting issues still to be considered can be 
summarized in the following: 

• Up to this point we have only taken into account 
metrics for undirected graphs. It will be of great 
interest to use metrics for directed graphs in or- 
der to further evaluate and improve our generation 
strategy. 

• More sophisticated statistical methods can possi- 
bly be applied in order to further enhance the ac- 
curacy and performance of our generator. 
• As mentioned in Section 2.2, graph shrinking is of
great importance for the research community (e.g.
minimizing the simulation cost). In the future, we
seek to examine the capabilities of our generator
to form an accurate graph shrinking tool as well.

We plan to address the above issues in our future work,
creating in this way a complete Internet AS level graph
generator available to the research community. Finally, as
a further step, we are interested in applying our generic
conceptual approach to different types of graphs that
follow different conceptual models. A higher level of
abstraction will thus be imposed on our graph generator,
increasing at the same time its possible applications.

Figure 1: The node degree distribution for both the
snapshot of the Internet that we had and for the graph we
generated. (a) The node degree distribution for the given
snapshot of the Internet. (b) The node degree distribution
for the graph that we generated.

Figure 2: The node degree CCDF for both the snapshot of the
Internet that we had and for the graph we generated. (a)
The node degree CCDF for the given snapshot of the
Internet. (b) The node degree CCDF for the graph that we
generated.